Phase 6.2: async outbound connect — eliminate 3s vCPU stall #78
Merged
9 bite-sized tasks covering the TcpStream::connect_timeout(3s) removal from the vCPU TX path:
- New TcpNatState::Connecting state.
- Non-blocking socket via socket2 + EINPROGRESS handling.
- EPOLLOUT-driven completion in relay_pending_connects, called from drain_to_guest before relay_tcp_nat_data.
- getsockopt(SO_ERROR) checks the actual connect outcome on EPOLLOUT readiness.
- EpollDispatch::modify (EPOLL_CTL_MOD) flips Write→Read on successful connect.
- CONNECT_TIMEOUT (3s) reaping for stuck Connecting flows (silent firewall drop).
- Two new pins: connect-to-unreachable-doesn't-block-others (BROKEN_ON_PURPOSE → flips at Task 5) + async-RST-on-failure.
- One new bench: process_syn_during_pending_connects, parametric on N pending connecting flows (O(1) regression gate).
Severity: MEDIUM-HIGH. Today TcpStream::connect_timeout(addr, 3s) on the vCPU thread freezes ALL guest networking for up to 3s when one destination is slow/unreachable.
…ows (BROKEN_ON_PURPOSE)
…ndshakes
Replace the synchronous TcpStream::connect_timeout(3s) on the vCPU thread with a non-blocking socket2 connect that returns EINPROGRESS immediately. Flows are inserted with TcpNatState::Connecting and their fd registered for EPOLLOUT. EPOLLOUT-driven completion (Task 5: relay_pending_connects) will promote them to SynReceived and send SYN-ACK. An unreachable destination no longer blocks all other guest networking for 3 seconds.
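As a rough illustration of the EINPROGRESS handling described in this commit, the sketch below classifies the result of a non-blocking connect(). It is std-only and does not call socket2; the names ConnectStart and classify_connect are hypothetical and not taken from the branch, and the errno value is hard-coded for x86-64 and aarch64 Linux to avoid a libc dependency.

```rust
use std::io;

/// Linux errno for "operation now in progress" on x86-64 and aarch64.
/// Hard-coded (assumption) only to keep this sketch free of the `libc` crate.
const EINPROGRESS: i32 = 115;

/// Hypothetical classification of a non-blocking connect() result.
#[derive(Debug, PartialEq, Eq)]
pub enum ConnectStart {
    /// connect() completed at once (can happen on loopback).
    Established,
    /// Handshake is in flight: insert the flow as Connecting and wait for EPOLLOUT.
    InProgress,
    /// Immediate failure, carrying the raw OS error when one is available.
    Failed(Option<i32>),
}

/// Maps the io::Result of a non-blocking connect() onto the three cases above.
pub fn classify_connect(result: io::Result<()>) -> ConnectStart {
    match result {
        Ok(()) => ConnectStart::Established,
        Err(e) if e.raw_os_error() == Some(EINPROGRESS) => ConnectStart::InProgress,
        Err(e) => ConnectStart::Failed(e.raw_os_error()),
    }
}
```

Under these assumptions, only the InProgress case needs the Connecting state; the point of the change is that none of the three cases waits on the vCPU thread.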
…connects)
Add EpollDispatch::modify (EPOLL_CTL_MOD) to atomically switch a registered fd's event interest from Write to Read without a DEL+ADD window. Add relay_pending_connects, called from drain_to_guest before relay_tcp_nat_data, which drives all pending Connecting flows: checks SO_ERROR, sends SYN-ACK and transitions to SynReceived on success, or RST and Closed on failure. Update rebuild_epoll_from_flow_table to reap Connecting entries post-snapshot (the underlying socket fd is dead after restore). The BROKEN_ON_PURPOSE pin tcp_connect_to_unreachable_does_not_block_other_flows now passes.
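The completion decision described above can be modelled as a small pure function. This is a minimal sketch, not the branch's code: TcpNatState and RegisterMode reuse names from this PR, while GuestReply, Completion and complete_connect are hypothetical, and the real relay_pending_connects also performs the socket and epoll calls that this sketch only describes as data.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TcpNatState {
    Connecting,
    SynReceived,
    Closed,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RegisterMode {
    Read,
    Write,
}

/// Hypothetical: the segment to send back to the guest.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GuestReply {
    SynAck,
    Rst,
}

/// Hypothetical: everything the caller has to apply for one Connecting flow.
#[derive(Debug, PartialEq, Eq)]
pub struct Completion {
    pub next_state: TcpNatState,
    pub reply: GuestReply,
    /// Some(mode): switch the fd's interest to `mode` (EPOLL_CTL_MOD).
    /// None: the flow is finished and the fd should be deregistered.
    pub next_interest: Option<RegisterMode>,
}

/// `so_error` is the value read with getsockopt(SO_ERROR) once EPOLLOUT
/// fired on a flow in TcpNatState::Connecting; 0 means the connect succeeded.
pub fn complete_connect(so_error: i32) -> Completion {
    if so_error == 0 {
        Completion {
            next_state: TcpNatState::SynReceived,
            reply: GuestReply::SynAck,
            next_interest: Some(RegisterMode::Read),
        }
    } else {
        Completion {
            next_state: TcpNatState::Closed,
            reply: GuestReply::Rst,
            next_interest: None,
        }
    }
}
```

Keeping the decision separate from the I/O is a choice made for this sketch; it makes the Write→Read flip and the RST path easy to check without a live socket.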
Verifies that connecting to a recently-dropped listener port eventually delivers a RST to the guest via relay_pending_connects's SO_ERROR path. Already passes after Task 5 lands; pinned now to guard the behavior.
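The host-side condition this pin relies on can be reproduced with std alone: bind an ephemeral loopback port, drop the listener, and connect to the now-closed port. The helper name below is hypothetical, and the sketch uses a blocking connect purely for brevity; it shows only the kernel behavior (an immediate refusal), not the guest-visible RST that relay_pending_connects delivers.

```rust
use std::io;
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::time::Duration;

/// Binds an ephemeral loopback port, drops the listener, then connects to it.
/// With nothing listening, the connect is expected to fail quickly.
pub fn connect_to_dropped_listener() -> io::Result<TcpStream> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr: SocketAddr = listener.local_addr()?;
    // Closing the listener frees the port; the kernel now answers a SYN with RST.
    drop(listener);
    TcpStream::connect_timeout(&addr, Duration::from_secs(1))
}
```

On Linux loopback the error kind is normally ConnectionRefused, the same outcome that the async path observes through SO_ERROR.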
Add Connecting-timeout detection to relay_tcp_nat_data's timeout sweep. Flows stuck in Connecting for longer than CONNECT_TIMEOUT (3 s — matching the pre-Phase-6.2 synchronous connect_timeout behavior) are reaped: a RST is sent to the guest and the flow table entry is removed. This handles the silent-firewall-drop case where EPOLLOUT never fires.
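The sweep described above might look roughly like the following. This is a sketch under stated assumptions: the Flow struct, the u16 key and reap_stuck_connects are hypothetical stand-ins for the real flow table, while CONNECT_TIMEOUT, the Connecting state and last_state_change are names taken from this PR.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Matches the pre-Phase-6.2 synchronous connect_timeout of 3 seconds.
const CONNECT_TIMEOUT: Duration = Duration::from_secs(3);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TcpNatState {
    Connecting,
    SynReceived,
}

/// Hypothetical, trimmed-down flow table entry.
pub struct Flow {
    pub state: TcpNatState,
    pub last_state_change: Instant,
}

/// Removes every flow that has sat in Connecting for longer than
/// CONNECT_TIMEOUT and returns the removed keys (sorted), so the caller
/// can send a RST to the guest for each of them.
pub fn reap_stuck_connects(flows: &mut HashMap<u16, Flow>, now: Instant) -> Vec<u16> {
    let mut reaped = Vec::new();
    flows.retain(|key, flow| {
        let stuck = flow.state == TcpNatState::Connecting
            && now.duration_since(flow.last_state_change) > CONNECT_TIMEOUT;
        if stuck {
            reaped.push(*key);
        }
        !stuck
    });
    reaped.sort_unstable();
    reaped
}
```

Passing `now` in as a parameter, rather than calling Instant::now() inside the sweep, is a choice made here so the timeout can be exercised without sleeping.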
Add insert_synthetic_connecting_entry bench-helper to SlirpBackend and add the process_syn_during_pending_connects parametric bench (args: 0, 10, 100, 1000 pending connects). Validates that the SYN-handler cost is O(1) in pending-connect backlog size — only flow_table.insert + epoll.register, both O(1).
The import was only consumed at the bench-helpers-gated process_syn_during_pending_connects bench (Task 8). Default-feature clippy --bench network failed with -D warnings because the import is unused when bench-helpers is off. Quickest fix: qualify the single bare-name use as smoltcp::wire::Ipv4Address (matches the other call sites in the file) and drop Ipv4Address from the use list.
This was referenced May 6, 2026
What this branch does
Replaces the synchronous TcpStream::connect_timeout(addr, 3s) on the vCPU thread with a non-blocking connect + EPOLLOUT-driven completion on the net-poll thread. The vCPU thread is never blocked on connect again.
Severity: Medium-High. Today, a guest opening a connection to ONE unreachable destination freezes ALL guest networking for up to 3 seconds (the connect_timeout). DNS misconfigurations, transient NAT failures, or one slow destination among many freeze the whole pipeline.
Headline win
connect_timeout
The new BROKEN_ON_PURPOSE pin tcp_connect_to_unreachable_does_not_block_other_flows flips at the EPOLLOUT-completion commit (91947a3, feat(slirp): EPOLLOUT-driven async connect completion).
Architecture
- New TcpNatState::Connecting state.
- socket2::Socket::new(IPV4, STREAM.nonblocking, TCP) → connect() returns EINPROGRESS → insert flow with state = Connecting, register FD with RegisterMode::Write → return immediately to vCPU.
- EPOLLOUT readiness → relay_pending_connects checks getsockopt(SO_ERROR):
  - on success: SynReceived, send SYN-ACK to guest, modify epoll Write→Read.
  - on failure: RST to guest, Closed.
- CONNECT_TIMEOUT (3 s) reaping for stuck Connecting flows (silent firewall drop); uses Phase 6.1's last_state_change field.
- EpollDispatch::modify (EPOLL_CTL_MOD) flips Write→Read on connect completion.
Bench evidence
scripts/bench-compare.sh --baseline 47868f0 --skip-vm:
- process_syn_during_pending_connects/0
- process_syn_during_pending_connects/10
- process_syn_during_pending_connects/100
- process_syn_during_pending_connects/1000
- port_forward_accept_latency
- poll_with_n_mixed_flows/999
- tcp_bulk_throughput_1mb
Wall-clock vs master
(Master baseline measured pre-6.x; current main already includes Phase 6.4 epoll dispatch (#69), Phase 6.1 half-close (#76), and port-forward listener on epoll (#77), so this PR's incremental delta vs the new main is the async-connect win specifically; the headline wall-clock numbers are the cumulative phase-6 stack.)
Commits (10)
Cherry-picked clean from smoltcp-passt-port-phase6.2-async-connect onto current main:
- docs: Phase 6.2 detailed TDD plan — async outbound connect
- chore: add socket2 dep for non-blocking connect (Cargo.lock regenerated)
- feat(slirp): add TcpNatState::Connecting + guest_isn field
- test(network): pin tcp_connect_to_unreachable_does_not_block_other_flows (BROKEN_ON_PURPOSE)
- feat(slirp): non-blocking connect — Connecting state for in-flight handshakes
- feat(slirp): EPOLLOUT-driven async connect completion (relay_pending_connects) (flips the BROKEN_ON_PURPOSE pin)
- test(network): pin tcp_connect_async_eventual_rst_on_failure
- feat(slirp): CONNECT_TIMEOUT reaping for stuck Connecting flows
- bench(network): process_syn_during_pending_connects (Phase 6.2 baseline)
- fix(bench): drop unused Ipv4Address import; qualify the one use site
(The Phase 6.2 empty-marker validation-gate commit was skipped during cherry-pick.)
Test plan
- cargo fmt --all -- --check: clean
- cargo clippy --workspace --all-targets --all-features -- -D warnings: clean
- RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --all-features: clean
- cargo test --test network_baseline -- --test-threads=1: 22/22 (was 20; +2 connect pins)
- cargo test --test network_baseline --features bench-helpers -- --test-threads=1: 24/24, stable across 4/5 runs (1/5 hits the pre-existing tcp_port_forward_inbound_connect_succeeds parallel-bind flake, unrelated to this change)
Stacked follow-up
Phase 6.3 (TCP window management) will rebase onto post-6.2 main next.
Replaces draft #74 (same async-connect content via the now-superseded #73 chain).